38 research outputs found

Non-parametric Bayesian modelling of digital gene expression data

Author: Gough Julian
Vavoulis Dimitrios V.
Publication venue: 'OMICS Publishing Group'
Publication date: 17/01/2013

Next-generation sequencing technologies provide a revolutionary tool for generating gene expression data. Starting with a fixed RNA sample, they construct a library of millions of differentially abundant short sequence tags or "reads", which constitute a fundamentally discrete measure of the level of gene expression. A common limitation in experiments using these technologies is the low number or even absence of biological replicates, which complicates the statistical analysis of digital gene expression data. Analysis of this type of data has often been based on modified tests originally devised for analysing microarrays; both these and even de novo methods for the analysis of RNA-seq data are plagued by the common problem of low replication. We propose a novel, non-parametric Bayesian approach for the analysis of digital gene expression data. We begin with a hierarchical model of over-dispersed count data and a blocked Gibbs sampling algorithm for inferring the posterior distribution of model parameters conditional on these counts. The algorithm compensates for the low number of biological replicates by clustering together genes whose tag counts are likely sampled from a common distribution and using this augmented sample to estimate the parameters of that distribution. The number of clusters is not decided a priori; it is inferred along with the remaining model parameters. We demonstrate the ability of this approach to model biological data with high fidelity by applying the algorithm to a public dataset obtained from cancerous and non-cancerous neural tissues.
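The term "over-dispersed count data" in the abstract above can be illustrated with a minimal sketch. It assumes a Gamma-Poisson (negative binomial) mixture, a common choice for such counts that the abstract does not itself name. The function names and the parameters mu (mean) and phi (dispersion) are hypothetical. The sketch does not implement the authors' hierarchical model, clustering or blocked Gibbs sampler.

```python
import math
import random
from statistics import mean, variance


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_overdispersed_count(mu, phi, rng):
    """Draw one count from a Gamma-Poisson mixture (assumed model).

    The Poisson rate is itself random: Gamma(shape=1/phi, scale=mu*phi),
    which has mean mu. The resulting counts have variance mu + phi * mu**2,
    i.e. larger than the mean whenever phi > 0.
    """
    rate = rng.gammavariate(1.0 / phi, mu * phi)
    return sample_poisson(rate, rng)


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [sample_overdispersed_count(mu=20.0, phi=0.5, rng=rng) for _ in range(5000)]

# For a plain Poisson model the two numbers printed here would be about equal.
print("sample mean:", mean(counts))
print("sample variance:", variance(counts))
```

With few replicates per gene, a dispersion parameter such as phi is hard to estimate from one gene alone. The abstract addresses this by pooling genes that appear to share a distribution.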

arXiv.org e-Print Archive

CiteSeerX

Structural and non-coding variants increase the diagnostic yield of clinical whole genome sequencing for rare diseases

Author: Allroggen Holger
Ansorge Olaf
Babbs Christian
Banka Siddharth
Baños-Piñero Benito
Beeson David
Ben-Ami Tal
Bennett David L.
Bento Celeste
Blair Edward
Brasch-Andersen Charlotte
Bull Katherine R.
Calpena Eduardo
Camps Carme
Cario Holger
Cilliers Deirdre
Conti Valerio
Dacal Beatriz Diez
Davies E. Graham
Dhalla Fatima
Dong Yin
Dreau Helene
Dunford James E.
Ferla Matteo
Giacopuzzi Edoardo
Guerrini Renzo
Harris Adrian L.
Hartley Jane
Hashim Mona
Hashimoto Akiko
Hollander Georg
Hughes Jim R.
Javaid Kassim
Kaisaki Pamela J.
Kane Maureen
Kelly Deirdre
Kelly Dominic
Kesim Yesim
Kini Usha
Knight Samantha J. L.
Kreins Alexandra Y.
Kvikstad Erika M.
Lange Lukas
Langman Craig B.
Lester Tracy
Lines Kate E.
Lord Simon R.
Lu Xin
Lunter Gerton
Mansour Sahar
Manzur Adnan
Maroofian Reza
Marsden Brian
Mason Joanne
McGowan Simon J.
Mei Davide
Mlcochova Hana
Murakami Yoshiko
Németh Andrea H.
Okoli Steven
Ormondroyd Elizabeth
Ousager Lilian Bomme
Pagnamenta Alistair T.
Palace Jacqueline
Patel Smita Y.
Pentony Melissa M.
Popitsch Niko
Pugh Chris
Rad Aboulfazl
Ragoussis Vassilis
Ramesh Archana
Riva Simone G.
Roberts Irene
Roy Noémi
Salminen Outi
Sanders Edward
Schilling Kyleen D.
Schuh Anna H.
Schwessinger Ron
Scott Caroline
Sen Arjune
Smith Conrad
Stevenson Mark
Taylor Jenny C.
Taylor John M.
Thakker Rajesh V.
Twigg Stephen R. F.
Uhlig Holm H.
van Wijk Richard
Vavoulis Dimitrios V.
Vona Barbara
Wall Steven
Wang Jing
Watkins Hugh
Wilkie Andrew O. M.
Yu Jing
Zak Jaroslav
Publication date: 09/11/2023

BACKGROUND: Whole genome sequencing (WGS) is increasingly being used for the diagnosis of patients with rare diseases. However, the diagnostic yields of many studies, particularly those conducted in a healthcare setting, are often disappointingly low, at 25-30%. This is in part because, although entire genomes are sequenced, analysis is often confined to in silico gene panels or coding regions of the genome.
METHODS: We undertook WGS on a cohort of 122 unrelated rare disease patients and their relatives (300 genomes) who had been pre-screened by gene panels or arrays. Patients were recruited from a broad spectrum of clinical specialties. We applied a bioinformatics pipeline that would allow comprehensive analysis of all variant types. We combined established bioinformatics tools for phenotypic and genomic analysis with our novel algorithms (SVRare, ALTSPLICE and GREEN-DB) to detect and annotate structural, splice site and non-coding variants.
RESULTS: Our diagnostic yield was 43/122 cases (35%), although 47/122 cases (39%) were considered solved when novel candidate genes with supporting functional data were taken into account. Structural, splice site and deep intronic variants contributed to 20/47 (43%) of our solved cases. Five genes that are novel, or were novel at the time of discovery, were identified, whilst a further three genes are putative novel disease genes with evidence of causality. We identified variants of uncertain significance in a further fourteen candidate genes. The phenotypic spectrum associated with RMND1 was expanded to include polymicrogyria. Two patients with secondary findings in FBN1 and KCNQ1 were confirmed to have previously unidentified Marfan and long QT syndromes, respectively, and were referred for further clinical interventions. Clinical diagnoses were changed in six patients and treatment adjustments made for eight individuals, which for five patients was considered life-saving.
CONCLUSIONS: Genome sequencing is increasingly being considered as a first-line genetic test in routine clinical settings and can make a substantial contribution to rapidly identifying a causal aetiology for many patients, shortening their diagnostic odyssey. We have demonstrated that structural, splice site and intronic variants make a significant contribution to diagnostic yield and that comprehensive analysis of the entire genome is essential to maximise the value of clinical genome sequencing.
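The percentages quoted in the RESULTS can be checked with a short calculation. This is only an arithmetic sketch: the helper name percent is hypothetical, and rounding to the nearest whole percent is an assumption about how the abstract reports its figures.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


# Counts taken from the abstract above.
diagnostic_yield = percent(43, 122)         # cases with a confirmed diagnosis
solved_with_candidates = percent(47, 122)   # also counting supported candidate genes
non_coding_share = percent(20, 47)          # solved cases involving structural, splice site or deep intronic variants

print(diagnostic_yield, solved_with_candidates, non_coding_share)
```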

University of Birmingham Research Portal

The University of Manchester - Institutional Repository

Whole-genome sequencing of chronic lymphocytic leukemia identifies subgroups with distinct biological and clinical features

The value of genome-wide over targeted driver analyses for predicting clinical outcomes of cancer patients is debated. Here, we report the whole-genome sequencing of 485 chronic lymphocytic leukemia patients enrolled in clinical trials as part of the United Kingdom's 100,000 Genomes Project. We identify an extended catalog of recurrent coding and noncoding genetic mutations that represents a source for future studies, and we provide the most complete high-resolution map of structural variants, copy number changes and global genome features including telomere length, mutational signatures and genomic complexity. We demonstrate the relationship of these features with clinical outcome and show that integration of 186 distinct recurrent genomic alterations defines five genomic subgroups that associate with response to therapy, refining conventional outcome prediction. While requiring independent validation, our findings highlight the potential of whole-genome sequencing to inform future risk stratification in chronic lymphocytic leukemia.

University of Liverpool Repository

DGEclust: differential expression analysis of clustered count data

Author: Dimitrios V Vavoulis
Julian Gough
Margherita Francescatto
Peter Heutink
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Consistency management with repair actions

Author: Aston John A. D.
Feng Jianfeng
Straub Volko A.
Vavoulis Dimitrios V.
Publication venue: IEEE Computer Society
Publication date: 01/05/2003

Comprehensive consistency management requires a strong mechanism for repair once inconsistencies have been detected. In this paper we present a repair framework for inconsistent distributed documents. The core piece of the framework is a new method for generating interactive repairs from full first-order logic formulae that constrain these documents. We present a full implementation of the components in our repair framework, as well as their application to the UML and related heterogeneous documents such as EJB deployment descriptors. We describe how our approach can be used as an infrastructure for building higher-level, domain-specific frameworks and provide an overview of related work in the database and software development environment communities.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Warwick Research Archives Portal Repository

Leicester Research Archive

FigShare

Simultaneous estimation of hidden model states (including intracellular calcium concentrations) and maximal conductances in a two-compartment model of a vertebrate motoneuron (II).

Author: Dimitrios V. Vavoulis (180603)
Jianfeng Feng (48703)
John A. D. Aston (180611)
Volko A. Straub (180605)

Inference of maximal conductances and noise parameters during fixed-lag smoothing. (A) The standard deviations of the observation (Ai) and the intrinsic (Aii) noise at the soma and the dendrite. (B) Inferred maximal conductances of the sodium and potassium currents at the soma (Bi), of the N-type calcium current and the calcium-activated potassium current at the soma (Bii), of the calcium-activated potassium current at the dendrite (Biii) and of the N-type and L-type calcium currents at the dendrite (Biv). In all cases, parameter expectations gradually converged towards the true parameter values (dashed lines) after less than . The grey lines in Aii, Biii and Biv correspond to estimated parameters, when current was injected in the soma only. In these simulations, , , and the prior interval for was .

FigShare

True and estimated values and prior intervals used during smoothing for all parameters in the two-compartment conductance-based model.

Author: Dimitrios V. Vavoulis (180603)
Jianfeng Feng (48703)
John A. D. Aston (180611)
Volko A. Straub (180605)

1 These parameter values were estimated when we used the broad prior intervals (see Fig. 11Ai, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002401#pcbi-1002401-g011).
2 Values in bold indicate the narrow prior intervals we used for generating Figs. 11Aii, 11B, 11C (and Supplementary Figs. S4, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002401#pcbi.1002401.s004, and S5, http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1002401#pcbi.1002401.s005).

FigShare